 number sense


A Small Math Model: Recasting Strategy Choice Theory in an LLM-Inspired Architecture

Rahman, Roussel, Shrager, Jeff

arXiv.org Artificial Intelligence

Strategy Choice Theory (SCT; Siegler and Shrager, 1984; Siegler, 2000) explains important aspects of children's arithmetic learning based upon principles including learning from developmentally naturalistic data, probabilistic representation, confidence-based retrieval, and the phase-like importance of scaffolding strategies, such as finger-counting. Here we recast SCT as a "Small Math Model" (SMM), employing a neural-network-based architecture analogous to LLMs. The SMM extends SCT to include counting practice, symbol (number) embedding, and gated attention. Similar to earlier work, the SMM demonstrates constructive and destructive interference between counting and addition, and the "wave-like" use of finger-counting as sum recall improves. We plan to extend the SMM to later aspects of the decades-long SCT program, including adaptive strategy choice and eventually strategy discovery, providing a unified platform to investigate the understanding of numerical characteristics and relationships essential for mathematical reasoning -- as it can emerge in LLM-based agents.


A Fragile Number Sense: Probing the Elemental Limits of Numerical Reasoning in LLMs

Rahman, Roussel, Mishra, Aashwin Ananda

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable emergent capabilities, yet the robustness of their numerical reasoning remains an open question. While standard benchmarks evaluate LLM reasoning on complex problem sets using aggregated metrics, they often obscure foundational weaknesses. In this work, we probe LLM mathematical numeracy by evaluating performance on problems of escalating complexity, from constituent operations to combinatorial puzzles. We test several state-of-the-art LLM-based agents on a 100-problem challenge comprising four categories: (1) basic arithmetic, (2) advanced operations, (3) primality checking, and (4) the Game of 24 number puzzle. Our results show that while the agents achieved high accuracy on the first three categories, which require deterministic algorithmic execution, they consistently failed at the number puzzle, indicating that its demand for heuristic search over a large combinatorial space is a significant bottleneck. These findings reveal that the agents' proficiency is largely confined to recalling and executing known algorithms, rather than performing generative problem-solving. This suggests their apparent numerical reasoning is more akin to sophisticated pattern-matching than flexible, analytical thought, limiting their potential for tasks that require novel or creative numerical insights.


Large Language Models in Numberland: A Quick Test of Their Numerical Reasoning Abilities

Rahman, Roussel

arXiv.org Artificial Intelligence

An essential element of human mathematical reasoning is our number sense -- an abstract understanding of numbers and their relationships -- which allows us to solve problems involving vast number spaces using limited computational resources. Mathematical reasoning of Large Language Models (LLMs) is often tested on high-level problems (such as Olympiad challenges, geometry, word problems, and puzzles), but their low-level number sense remains less explored. We introduce "Numberland," a 100-problem test to evaluate the numerical reasoning abilities of LLM-based agents. The tasks -- basic operations, advanced calculations (e.g., exponentiation, complex numbers), prime number checks, and the 24 game -- aim to test elementary skills and their integration in solving complex and uncertain problems. We evaluated five LLM-based agents: OpenAI's o1 and o1-mini, Google Gemini, Microsoft Copilot, and Anthropic Claude. They scored 74-95% on the first three tasks that allow deterministic steps to solutions. In the 24 game, which needs trial-and-error search, performance dropped to 10-73%. We tested the top 24 solver (o1 with 73% accuracy) on 25 harder problems, and its score fell to 27%, confirming search as a bottleneck. These results, along with the types of mistakes, suggest a fragile number sense in LLMs, which is a bit surprising given their prowess in challenging benchmarks. The limits of LLM numerical reasoning highlight the scope of simple, targeted tests to evaluate and explain LLM math skills to ensure safe use.
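To make concrete why the 24 game calls for search rather than a fixed sequence of steps, here is a minimal brute-force sketch in Python. It is not code from the paper: the function names `solve_24` and `_search` are hypothetical, and it assumes the common rules of the game (use every number exactly once, combine with +, -, *, /, and reach exactly 24). Exact fractions are used so that solutions passing through non-integer intermediate values are not lost to rounding.

```python
from fractions import Fraction
from itertools import combinations


def solve_24(numbers, target=24):
    """Return one expression string that evaluates to target, or None.

    Hypothetical helper for illustration; assumes the usual 24-game rules.
    """
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, Fraction(target))


def _search(items, target):
    # Base case: one value left, so every input number has been used once.
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None

    # Pick any two remaining values, combine them, and recurse on the rest.
    for i, j in combinations(range(len(items)), 2):
        (a, ea), (b, eb) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]

        candidates = [
            (a + b, f"({ea} + {eb})"),
            (a * b, f"({ea} * {eb})"),
            (a - b, f"({ea} - {eb})"),
            (b - a, f"({eb} - {ea})"),
        ]
        # Division is only defined for a non-zero divisor.
        if b != 0:
            candidates.append((a / b, f"({ea} / {eb})"))
        if a != 0:
            candidates.append((b / a, f"({eb} / {ea})"))

        for value, expr in candidates:
            found = _search(rest + [(value, expr)], target)
            if found is not None:
                return found
    return None


if __name__ == "__main__":
    print(solve_24([4, 6, 1, 1]))   # prints one valid expression
    print(solve_24([1, 1, 1, 1]))   # prints None: no combination reaches 24
```

Even with only four inputs, the sketch tries every pairing and every operation at each level, which illustrates the trial-and-error character of the task as opposed to the deterministic procedures behind the other three categories.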


Teach CLIP to Develop a Number Sense for Ordinal Regression

Du, Yao, Zhai, Qiang, Dai, Weihang, Li, Xiaomeng

arXiv.org Artificial Intelligence

Ordinal regression is a fundamental problem within the field of computer vision, with customised well-trained models on specific tasks. While pre-trained vision-language models (VLMs) have exhibited impressive performance on various vision tasks, their potential for ordinal regression has received less exploration. In this study, we first investigate CLIP's potential for ordinal regression, from which we expect the model could generalise to different ordinal regression tasks and scenarios. Unfortunately, vanilla CLIP fails on this task, since current VLMs have a well-documented limitation of encapsulating compositional concepts such as number sense. We propose a simple yet effective method called NumCLIP to improve the quantitative understanding of VLMs. We disassemble the exact image to number-specific text matching problem into coarse classification and fine prediction stages. We discretize and phrase each numerical bin with common language concept to better leverage the available pre-trained alignment in CLIP. To consider the inherent continuous property of ordinal regression, we propose a novel fine-grained cross-modal ranking-based regularisation loss specifically designed to keep both semantic and ordinal alignment in CLIP's feature space. Experimental results on three general ordinal regression tasks demonstrate the effectiveness of NumCLIP, with 10% and 3.83% accuracy improvement on historical image dating and image aesthetics assessment task, respectively. Code is publicly available at https://github.com/xmed-lab/NumCLIP.


Large-scale Generative AI Models Lack Visual Number Sense

Testolin, Alberto, Hou, Kuinan, Zorzi, Marco

arXiv.org Artificial Intelligence

Humans can readily judge the number of objects in a visual scene, even without counting, and such a skill has been documented in a variety of animal species and in babies prior to language development and formal schooling. Numerical judgments are error-free for small sets, while for larger collections responses become approximate, with variability increasing proportionally to the target number. This response pattern is observed for items of all kinds, despite variation in object features (such as color or shape), suggesting that our visual number sense relies on abstract representations of numerosity. Here, we investigated whether generative Artificial Intelligence (AI) models based on large-scale transformer architectures can reliably name the number of objects in simple visual stimuli or generate images containing a target number of items in the 1-10 range. Surprisingly, none of the foundation models considered performed in a human-like way: They all made striking errors even with small numbers, the response variability often did not increase in a systematic way, and the pattern of errors varied with object category. Our findings demonstrate that advanced AI systems still lack a basic ability that supports an intuitive understanding of numbers, which in humans is foundational for numeracy and mathematical development.
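The human response pattern described in this abstract (exact answers for small sets, and spread that grows in proportion to the target for larger ones) can be sketched as a toy simulation. The snippet below is only an illustration, not the authors' analysis: the function name `simulate_estimates`, the Gaussian noise model, the proportionality constant `weber_fraction=0.15`, and the exact-answer cutoff `subitizing_limit=4` are all assumptions chosen for the example.

```python
import random
import statistics


def simulate_estimates(target, weber_fraction=0.15, subitizing_limit=4,
                       trials=1000, seed=0):
    """Simulate numerosity estimates for one target number.

    Illustrative assumptions: small sets are reported exactly; larger sets
    get Gaussian noise whose standard deviation is weber_fraction * target.
    """
    rng = random.Random(seed)
    if target <= subitizing_limit:
        # Error-free regime for small sets.
        return [target] * trials
    sd = weber_fraction * target  # variability proportional to the target
    return [max(0, round(rng.gauss(target, sd))) for _ in range(trials)]


if __name__ == "__main__":
    for n in (3, 10, 20, 40):
        estimates = simulate_estimates(n)
        print(n, statistics.mean(estimates), statistics.pstdev(estimates))
```

Under these assumptions the spread is zero for small targets and widens roughly linearly afterwards, which is the signature the study reports as missing from the foundation models it tested.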


Scientists discover a brain circuit that boosts maths skills in children

Daily Mail - Science & tech

Scientists have discovered a brain circuit that boosts maths skills in children and could even be targeted to improve learning. The circuit triggers an area near the back of the head known as the IPS (intraparietal sulcus), which is involved in processing figures, and is linked to the hippocampus where memories are stored. Before children can learn to add and subtract, they must learn which abstract symbol, like '4' or '6', represents which quantity, a skill also known as 'number sense'. Experts know the IPS plays a role in number processing but the circuits involved in learning number sense had remained a mystery until now. Lead author Dr Hyesang Chang, of Stanford University, California, said: 'Mathematical skill development relies on number sense, the ability to discriminate between quantities.


A Number Sense as an Emergent Property of the Manipulating Brain

Kondapaneni, Neehar, Perona, Pietro

arXiv.org Artificial Intelligence

The ability to understand and manipulate numbers and quantities emerges during childhood, but the mechanism through which this ability is developed is still poorly understood. In particular, it is not known whether acquiring such a number sense is possible without supervision from a teacher. To explore this question, we propose a model in which spontaneous and undirected manipulation of small objects trains perception to predict the resulting scene changes. We find that, from this task, an image representation emerges that exhibits regularities that foreshadow numbers and quantity. These include distinct categories for zero and the first few natural numbers, a notion of order, and a signal that correlates with numerical quantity. As a result, our model acquires the ability to estimate the number of objects in the scene, as well as subitization, i.e. the ability to recognize at a glance the exact number of objects in small scenes. We conclude that important aspects of a facility with numbers and quantities may be learned without explicit teacher supervision.


Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning

Zhang, Wenhe, Zhang, Chi, Zhu, Yixin, Zhu, Song-Chun

arXiv.org Artificial Intelligence

As a comprehensive indicator of mathematical thinking and intelligence, the number sense (Dehaene 2011) bridges the induction of symbolic concepts and the competence of problem-solving. To endow such a crucial cognitive ability to machine intelligence, we propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model -- And-Or Graph (AOG). These visual arithmetic problems are in the form of geometric figures: each problem has a set of geometric shapes as its context and embedded number symbols. Solving such problems is not trivial; the machine not only has to recognize the number, but also to interpret the number with its contexts, shapes, and relations (e.g., symmetry) together with proper operations. We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task. Comprehensive experiments show that current neural-network-based models still struggle to understand number concepts and relational operations. We show that a simple brute-force search algorithm could work out some of the problems without context information. Crucially, taking geometric context into account by an additional perception module would provide a sharp performance gain with fewer search steps. Altogether, we call for attention in fusing the classic search-based algorithms with modern neural networks to discover the essential number concepts in future research.


An AI System Spontaneously Develops Baby-Like Ability to Gauge Big and Small

#artificialintelligence

Training software that emulates brain networks to identify dog breeds or sports equipment is by now old news. But getting such an AI network to learn a process on its own that is innate to early child development is truly novel. In a paper published Wednesday in Science Advances, a neural network distinguished between different quantities of things, even though it was never taught what a number is. The neural net reprised a cognitive skill innate to human babies, monkeys and crows, among others. Without any training, it suddenly could tell the difference between larger and smaller amounts--a skill called numerosity, or number sense.


A new AI acquired humanlike 'number sense' on its own (Science News)

#artificialintelligence

Artificial intelligence can share our natural ability to make numeric snap judgments. Researchers observed this knack for numbers in a computer model composed of virtual brain cells, or neurons, called an artificial neural network. After being trained merely to identify objects in images -- a common task for AI -- the network developed virtual neurons that respond to specific quantities. These artificial neurons are reminiscent of the "number neurons" thought to give humans, birds, bees and other creatures the innate ability to estimate the number of items in a set (SN: 7/7/18, p. 7). This intuition is known as number sense.